26 research outputs found
Improved multimedia server I/O subsystems
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.---- Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.The main function of a continuous media server is to concurrently stream data from storage to multiple clients over a network. The resulting streams will congest the host CPU bus, reducing access to the system's main memory, which degrades CPU performance. The purpose of this paper is to investigate ways of improving I/O subsystems of continuous media sewers. Several improved I/O subsystem architectures are presented and their performances evaluated. The proposed architectures use an existing device, namely the Intel i960RP processor. The objective of using an I/O processor is to move the stream and its control from the host processor and the main memory. The ultimate aim is to identify the requirements for an integrated I/O subsystem for a high performance scalable media-on-demand server
An improved parallel archutecture for MPEG-4 motion estimation in 3G mobile applications
A high-parallel VLSI core architecture for MPEG-4 motion estimation is proposed in this paper. It possesses the characteristics of low memory bandwidth and low clock rate requirements, thus primarily aiming at 3G mobile applications. Based on a one-dimensional tree architecture, the architecture employs the dual-register/buffer technique to reduce the preload and alignment cycles. As an example, full-search block matching algorithm has been mapped onto this architecture using a 16-PE array that has the ability to calculate the motion vectors of QCIF video sequences in real time at 1 MHz clock rate and using 15.5 Mbytes/s memory bandwidth